Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition
Abstract
A method is proposed to incorporate mixture density splitting into the discriminative training of acoustic models for speech recognition. The standard approach is to obtain a high-resolution acoustic model by maximum likelihood training and density splitting, and then to improve this model by discriminative training. We choose a log-linear form of the acoustic model because, for a single Gaussian density per triphone state, log-linear maximum mutual information (MMI) optimization is a convex optimization problem, and by further splitting and discriminative training of this model we can obtain a model of higher complexity. Previous work showed that this approach achieves large gains in the objective function and corresponding moderate gains in word error rate on a large-vocabulary corpus. This paper incorporates the state-of-the-art minimum phone error (MPE) training criterion into the framework and shows that, after discriminative splitting, a subsequent log-linear MPE training achieves better results than Gaussian mixture model MPE optimization alone.
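The link between a single Gaussian per state and a log-linear model, which the abstract relies on, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one-dimensional features for readability, and the names `gaussian_to_loglinear`, `gaussian_posteriors`, `loglinear_posteriors`, `STATES`, and `PARAMS` are hypothetical. It shows only that state posteriors computed from Gaussians with priors equal those of a log-linear model with first- and second-order features.

```python
import math

def gaussian_to_loglinear(mean, var, prior):
    """Map a 1-D Gaussian with a state prior to log-linear parameters.

    log(prior * N(x; mean, var)) = lam2 * x^2 + lam1 * x + alpha
    """
    lam2 = -0.5 / var
    lam1 = mean / var
    alpha = (math.log(prior)
             - 0.5 * mean * mean / var
             - 0.5 * math.log(2.0 * math.pi * var))
    return lam2, lam1, alpha

def softmax(scores):
    """Normalize log-scores into probabilities (shifted for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_posteriors(x, states):
    """State posteriors p(s | x) from Gaussian likelihoods and priors."""
    scores = [math.log(p) - 0.5 * math.log(2.0 * math.pi * v)
              - 0.5 * (x - m) ** 2 / v
              for (m, v, p) in states]
    return softmax(scores)

def loglinear_posteriors(x, params):
    """State posteriors from the log-linear form with features (x^2, x, 1)."""
    scores = [l2 * x * x + l1 * x + a for (l2, l1, a) in params]
    return softmax(scores)

# Hypothetical toy states: (mean, variance, prior).
STATES = [(0.0, 1.0, 0.5), (2.0, 0.5, 0.3), (-1.0, 2.0, 0.2)]
PARAMS = [gaussian_to_loglinear(*s) for s in STATES]
```

Calling `gaussian_posteriors(0.7, STATES)` and `loglinear_posteriors(0.7, PARAMS)` should give the same distribution up to floating-point error. In the log-linear form the parameters enter the score linearly, which is the property behind the convexity of the MMI objective in the single-density case.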
Similar resources
Discriminative training of stochastic Markov graphs for speech recognition
This paper proposes the application of discriminative training techniques based on the Generalized Probabilistic Descent (GPD) approach to Stochastic Markov Graphs (SMGs), a generalization of mixture-state Hidden Markov Models (HMMs), describing the constraints in the acoustic structure of speech as a graph consisting of nodes, each containing a base function, and a transition network between t...
Large Scale Discriminative Training for Speech Recognition
This paper describes, and evaluates on a large scale, the lattice based framework for discriminative training of large vocabulary speech recognition systems based on Gaussian mixture hidden Markov models (HMMs). The paper concentrates on the maximum mutual information estimation (MMIE) criterion which has been used to train HMM systems for conversational telephone speech transcription using up ...
The cascade HMM/ANN hybrid: A new framework for discriminative training in speech recognition
In this paper, a new formulation for discriminative training of HMMs is presented. This formulation uses a properly trained MLP in a simple interconnection with HMMs called “Cascade HMM/ANN Hybrid”. Our training algorithm has a simple realization in comparison with other discriminative training methods for HMMs such as MDI and MMI. We also present a rigorous mathematical proof of its convergence. We found t...
Large Margin Training of Acoustic Models for Speech Recognition
Fei Sha. Advisor: Prof. Lawrence K. Saul. Automatic speech recognition (ASR) depends critically on building acoustic models for linguistic units. These acoustic models usually take the form of continuous-density hidden Markov models (CD-HMMs), whose parameters are obtained by maximum likelihood estimation. Recently, however, there ha...
Discriminative training of GMM-HMM acoustic model by RPCL learning
This paper presents a new discriminative approach for training the Gaussian mixture models (GMMs) of a hidden Markov model (HMM) based acoustic model in a large vocabulary continuous speech recognition (LVCSR) system. The approach embeds a rival penalized competitive learning (RPCL) mechanism at the level of hidden Markov states. For every input, the correct identity state, calle...